Abstract In recent years, in response to empirical findings suggesting scale-free
self-organisation phenomena emerging in complex nervous systems at a mesoscale level,
there has been a significant search for suitable models and theoretical explanations in the
neuroscientific literature, see the recent survey by Bullmore & Sporns (2009). In Piekniewski
& Schreiber (2008) we developed a simple and tractable mathematical model shedding
some light on a particular class of the aforementioned phenomena, namely on mesoscopic
level self-organisation of functional brain networks under fMRI imaging, where we
achieved a high degree of agreement with existing empirical reports. Being addressed to
the neuroscientific community, our work Piekniewski & Schreiber (2008) relied on a
semi-rigorous study of information flow structure in a class of recurrent neural networks
exhibiting asymptotic scale-free behaviour and admitting a description in terms of the so-called
winner-take-all dynamics. The purpose of the present paper is to define and study these
winner-take-all networks with full mathematical rigour in the context of their asymptotic
spectral properties. Our main result is a limit theorem for spectra of the spike-flow graphs
induced by the winner-take-all dynamics. We provide an explicit characterisation of the limit
spectral measure expressed in terms of zeros of Bessel's J-function.
1 Introduction and motivations



The last few years in the neuroscientific literature have been marked by a very successful
interdisciplinary interaction between the study of large-scale phenomena in complex nervous
systems and random graph theory, especially in the context of the so-called scale-free
networks, considered a nearly classical subject by now, see e.g. Albert & Barabasi (2002)
or Chung & Lu (2006) and Durrett (2007) for a mathematical treatment. Among the plethora
of particular topics studied, the one in the focus of our interest is the statistical properties of
the so-called functional brain networks arising under fMRI imaging at mesoscale (usually
understood as individual voxel level), where small-world and scale-free self-organisation of
activity correlations has been reported in empirical findings, see e.g. Bullmore & Sporns
(2009) for an extensive review and Eguiluz et al. (2005), Salvador et al. (2005), Cecchi
et al. (2007) and van den Heuvel (2008) for presentation and discussion of experimental
results. Certain heuristic, non-rigorous considerations aimed at explaining these phenomena
have been offered in Fraiman (2009) and Kitzbichler (2009), discussing very interesting
analogies between crucial features of functional brain networks and the Ising model at
criticality. To the best of our knowledge, the first dedicated mathematical model shedding some
light on the scale-free properties of mesoscopic brain functional networks is the simple spin
glass type system introduced in Piekniewski & Schreiber (2008), further extended and
enhanced with a geometric ingredient in Piersa, Piekniewski & Schreiber (2010), and standing
in good agreement with empirical findings. The details and neuroscientific motivations of
these models are far beyond the scope of the present mathematically oriented paper and
we only provide a brief overview for completeness here, proceeding to well-defined rigorous
problems as soon as possible.

The disordered system proposed in Piekniewski & Schreiber (2008) models an asynchronous
spiking neural network with the aim of analysing the structure of information
flow in a class of recurrent neural nets. The model, bearing formal resemblance to the
celebrated Sherrington-Kirkpatrick (1972) spin glass yet exhibiting quite different behaviour,
consists of $N$ formal neurons $\sigma_i$, $i = 1, \ldots, N$, where the value $\sigma_i \in \{0, 1, 2, \ldots\}$ represents
the charge (activity level) stored at $\sigma_i$. Initially each neuron stores some small fixed charge.
The charge-conserving Kawasaki-style evolution of the system takes place by choosing at
random subsequent pairs of numbers $i \neq j$ and trying to transfer a unit charge from $\sigma_i$ to
$\sigma_j$. As soon as $\sigma_i > 0$, such a trial is always successful if it decreases the energy of the system;
otherwise it is accepted with probability $\exp(-\beta \Delta H)$ and rejected with the complementary
probability, where $\beta > 0$ is some positive inverse temperature parameter whereas the
energy $H$ of the system is given by $H := \frac{1}{2} \sum_{i \neq j} w_{ij} |\sigma_i - \sigma_j|$ with $w_{ij} = w_{ji}$ standing for i.i.d.
standard Gaussian connection weights. In standard intuitive terms, the presence of a positive
weight between two neurons indicates that the system favours the agreement of their
activity levels whereas a negative weight means that disagreement is preferred. The object
in the focus of our interest in Piekniewski & Schreiber (2008) was the spike-flow graph or
charge-flow network generated by this dynamics, defined by ascribing to each edge $(ij)$ the
multiplicity equal to the number of charge transfers occurring along $(ij)$ in the course of
(a long enough period of) the dynamics. This object gains a natural interpretation upon
noting that edges with high multiplicities are those essential to the dynamics as designed
to model the neural network's spiking activity, whereas the low multiplicity edges are only
seldom used and could as well be removed from the network without effectively affecting
its evolution. In informal terms, the charge-flow graph represents the essential support of
the system's effective dynamics, whence our interest in this object.
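
The transfer rule described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the energy is taken in the form $H = \frac{1}{2}\sum_{i \neq j} w_{ij}|\sigma_i - \sigma_j|$ as read from the text, and the function names (`energy`, `transfer_step`, `run`) as well as the default parameter values are hypothetical choices made for the example, not part of the original model description.

```python
import math
import random


def energy(sigma, w):
    # H = sum over pairs i < j of w[i][j] * |sigma_i - sigma_j|, which equals
    # (1/2) * sum over i != j for symmetric weights (form assumed from the text).
    n = len(sigma)
    return sum(w[i][j] * abs(sigma[i] - sigma[j])
               for i in range(n) for j in range(i + 1, n))


def transfer_step(sigma, w, beta, rng):
    """Try to move one unit of charge from a random i to a random j != i.

    Returns (i, j) if the transfer was accepted and None otherwise.
    """
    i, j = rng.sample(range(len(sigma)), 2)
    if sigma[i] == 0:
        return None  # no charge to transfer
    before = energy(sigma, w)
    sigma[i] -= 1
    sigma[j] += 1
    delta = energy(sigma, w) - before
    # Energy-decreasing moves are always accepted, others with prob. exp(-beta * delta).
    if delta <= 0 or rng.random() < math.exp(-beta * delta):
        return (i, j)
    sigma[i] += 1  # rejected: undo the move
    sigma[j] -= 1
    return None


def run(n=8, charge=2, steps=2000, beta=5.0, seed=0):
    rng = random.Random(seed)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w[i][j] = w[j][i] = rng.gauss(0.0, 1.0)  # i.i.d. standard Gaussian weights
    sigma = [charge] * n
    flow = [[0] * n for _ in range(n)]  # spike-flow multiplicity of each edge
    accepted = 0
    for _ in range(steps):
        move = transfer_step(sigma, w, beta, rng)
        if move is not None:
            i, j = move
            flow[i][j] += 1
            flow[j][i] += 1
            accepted += 1
    return sigma, flow, accepted
```

Calling `run()` returns the final charge configuration, the symmetric matrix of edge multiplicities (the spike-flow graph) and the number of accepted transfers. Recomputing the full energy at every step is wasteful but keeps the sketch short.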

In Piekniewski & Schreiber (2008) we have performed a semi-rigorous analysis of the
above model, based on extreme value theory methods, arguing that for $N$ large enough its
ground state arises by putting the whole system charge into one best neuron (determined
as a function of the weights $w_{ij}$) and leaving all the remaining ones empty. Moreover, at
low enough temperatures, the dynamics of such networks in large $N$ asymptotics is well
approximated, in the sense made precise ibidem, by a much simpler winner-take-all (WTA)
dynamics described in detail and rigour in Section 2 below. This observation allowed us to
show in Piekniewski & Schreiber (2008) that asymptotically the charge-flow networks are
scale-free with exponent 2, see ibidem as well as Piersa, Piekniewski & Schreiber (2010),
in agreement with the empirical findings as quoted above. We have also argued there
that even though the spin glass model we propose may be regarded as quite specific, its
large scale behaviour and in particular its winner-take-all approximation is presumably
universal for a large class of networks where each formal neuron represents a computational
unit exhibiting some non-trivial internal structure and memory, for instance a group of
biological or artificial neurons (see Piekniewski, 2007) whose internal state requires more
complicated labeling than just $\{-1, +1\}$ as in the original Sherrington-Kirkpatrick model,
whence the $\mathbb{N}$-valued labels in our model.

The purpose of this paper is to complement the semi-rigorous developments of
Piekniewski & Schreiber (2008) by carrying out a fully rigorous mathematical study of the
asymptotic structure of random charge-flow graphs generated by the winner-take-all dynamics
described in full detail in Section 2 below. More precisely, we focus on spectral measures of
these graphs as providing important information about their underlying structure, see e.g.
Chapters 8 and 9 in Chung & Lu (2006) for a discussion of spectral aspects of scale-free
graphs.
2 The model and main results



To provide a formal description of the winner-take-all dynamics, consider the set $\{1, \ldots, n\}$
of network vertices, each vertex identified with its rank between 1 and $n$. Initially there are
$m = \lfloor \alpha n \rfloor$, $\alpha \in \mathbb{R}_+$, units of charge present in the system, with each unit stored in a
vertex chosen uniformly at random, independently of other units. The system evolves
thereupon according to the following sequential winner-take-all (WTA) dynamics, with $\sigma_i$
standing for the current charge stored at $i$.

(WTA) Choose uniformly at random a source vertex $i \in \{1, \ldots, n\}$ and, independently,
a target vertex $j \in \{1, \ldots, n\}$.

• If $j < i$ and $\sigma_i > 0$ then transfer a unit charge from $i$ to $j$, that is to say set
$\sigma_i := \sigma_i - 1$ and $\sigma_j := \sigma_j + 1$.

• If $j = i$ and $\sigma_i > 0$ then remove a unit charge from $i$, setting $\sigma_i := \sigma_i - 1$.

• If $j > i$ then no update occurs.

In other words, at each step of the dynamics a charge transfer attempt is made between
two random vertices, which is successful whenever the source vertex has a higher rank than
the target vertex. Whenever a self-transfer is attempted, a unit charge is removed from the
system (charge leak occurs), although another natural interpretation is that the evolution
of the charge unit terminates at this point and the charge remains stored forever at the
vertex considered rather than being removed from the system, which makes (WTA) into a
charge-conserving dynamics. These interpretational issues, which become important when
discussing precise technical relationships between the original neural network model and
its winner-take-all approximation, see Piersa, Piekniewski & Schreiber (2010), fall beyond
the scope of the present mathematically oriented article. The updates in this dynamics
are performed until there are no more charge units evolving in the system, that is to
say $\sigma_i = 0$ for all $i = 1, \ldots, n$. With each instance of such an evolution we associate
in a natural way its charge-flow network, also referred to as the spike-flow network due
to its interpretation in the context of spiking neural networks as originally considered in
Piekniewski & Schreiber (2008). The charge-flow network is an undirected graph with
multiple edges, where the edge multiplicity $A^{n,m}_{ij} = A^{n,m}_{ji}$ between $i, j$, $i > j$, is given by
the number of charge units transferred from $i$ to $j$ in the course of the WTA dynamics.
Conforming to the usual terminology, the random symmetric matrix $(A^{n,m}_{ij})_{i,j=1,\ldots,n}$ will
be called the adjacency matrix of the charge-flow network in the sequel. Moreover, the
number of charge transfers away from vertex $i$, that is to say $\sum_{j<i} A^{n,m}_{ij}$, will be called
the out-degree of $i$ and, likewise, the number $\sum_{i>j} A^{n,m}_{ij}$ of charge transfers to vertex $j$ will
be called its in-degree, whereas the sum of out- and in-degree will be called the degree of
the vertex. It can be shown, see Theorem 1 in Piekniewski & Schreiber (2008), whose
semi-rigorous proof can easily be brought to full rigour (which falls beyond the scope of
the present work though), that with overwhelming probability the charge-flow network is
asymptotically scale-free with exponent 2 as $n \to \infty$, that is to say the in- and out-degrees
of its vertices follow asymptotically a power law with exponent 2, see ibidem for further
details.
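
As an illustration, the (WTA) rules and the resulting charge-flow matrix can be simulated directly. The following Python sketch assumes the charge-leak interpretation of self-transfers (a unit is removed when $j = i$) and does not record self-transfers in the matrix; the names `simulate_wta`, `out_degree` and `in_degree` are hypothetical helpers introduced for this example only.

```python
import math
import random


def simulate_wta(n, alpha, seed=0):
    """Run the WTA dynamics until no charge is left.

    Vertices are 1..n (index 0 is unused). A[i][j] == A[j][i] counts the unit
    transfers between i and j. Returns the matrix A and the number m of units.
    """
    rng = random.Random(seed)
    m = math.floor(alpha * n)
    sigma = [0] * (n + 1)
    for _ in range(m):
        sigma[rng.randint(1, n)] += 1  # each unit starts at a uniform vertex
    A = [[0] * (n + 1) for _ in range(n + 1)]
    remaining = m
    while remaining > 0:
        i = rng.randint(1, n)  # source vertex
        j = rng.randint(1, n)  # target vertex
        if sigma[i] == 0 or j > i:
            continue  # no update occurs
        sigma[i] -= 1
        if j < i:
            sigma[j] += 1  # transfer a unit from i to j
            A[i][j] += 1
            A[j][i] += 1
        else:
            remaining -= 1  # j == i: the unit leaves the system
    return A, m


def out_degree(A, i):
    # Number of transfers away from i, i.e. towards lower-numbered vertices.
    return sum(A[i][j] for j in range(1, i))


def in_degree(A, j):
    # Number of transfers into j, i.e. from higher-numbered vertices.
    return sum(A[i][j] for i in range(j + 1, len(A)))
```

For instance, `A, m = simulate_wta(30, 2.0)` produces one realisation with 60 charge units; by construction vertex 1 has out-degree zero and vertex $n$ has in-degree zero.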

It is convenient and natural for our further purposes to consider the WTA evolutions
for different values of $n$, and hence also their corresponding charge-flow matrices
$(A^{n,m})_{n \ge 1,\, m = \lfloor \alpha n \rfloor}$, coupled on a common probability space, say $(\Omega, \mathcal{F}, \mathbb{P})$, as follows. For
each $n' > n$ the WTA dynamics on $\{1, \ldots, n\}$ is obtained from that on $\{1, \ldots, n'\}$ by

• Numbering from 1 to $m' = \lfloor \alpha n' \rfloor$ the charge units assigned to vertices in $\{1, \ldots, n'\}$
and constructing the restricted initial charge assignment for $\{1, \ldots, n\}$ by assigning
each among the initial $m = \lfloor \alpha n \rfloor$ units to the first vertex in $\{1, \ldots, n\}$ it hits in the
course of its extended evolution in $\{1, \ldots, n'\}$.

• Letting the evolution of the $m$ charge units in $\{1, \ldots, n\}$ arise as the restriction of
the corresponding dynamics of the initial $m$ among the $m'$ charge units in $\{1, \ldots, n'\}$
after reaching the set $\{1, \ldots, n\}$.

It is clear that this yields a consistent coupling for all $n \ge 1$, and all our probabilistic
statements in the sequel shall assume this coupling without further mention. Note in
particular that we have almost surely $(A^{n,m})_{ij} \le (A^{n',m'})_{ij}$ with $n \le n'$, $m \le m'$ and
$i, j \le n$, which allows us to interpret the charge-flow graph for $n$ as a subgraph of that for
$n' > n$.

The objects in the focus of our interest in the present paper are the (non-normalised!)
empirical spectral measures of $A^{n,m}$, $m = \lfloor \alpha n \rfloor$,

$$\mu_{n,m} := \sum_{i=1}^{n} \delta_{\lambda_i / n}, \qquad (1)$$

where $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_n$ are the eigenvalues of $A^{n,m}$ repeated according to their
multiplicities; note that all $\lambda_i$ are real numbers because $A^{n,m}$ is self-adjoint. Clearly, the total
mass of $\mu_{n,m}$ is $n$, but as will be seen in the sequel and as reflecting the power-law scaling
properties of the charge-flow graph, the random measure $\mu_{n,m}$ with arbitrarily high probability
puts almost all its mass in neighbourhoods of 0, corresponding to the overwhelming
majority of low degree vertices, even though the spectral radius of $A^{n,m}$ is asymptotically
of order $\Theta(n)$. In fact, we shall show that the mass which $\mu_{n,m}$ puts outside the neighbourhoods
of 0 is bounded and that, with $n \to \infty$ and $m = \lfloor \alpha n \rfloor$, the random measures $\mu_{n,m}$
converge almost surely to a non-trivial limit away from 0 in the sense specified below.

We say that a sequence $\zeta_n$ of Borel measures on $\mathbb{R}$ converges weakly away from zero to
a Borel measure $\zeta$ on $\mathbb{R}$ iff $\lim_{n\to\infty} \int f\,d\zeta_n = \int f\,d\zeta$ for all bounded continuous $f : \mathbb{R} \to \mathbb{R}$
which vanish in some neighbourhood of zero. To identify the weak limit away from zero
for $\mu_{n,m}$ consider the following trace class operator $M : l_2 \to l_2$ on the space of
square-summable sequences, given by

$$[M(a)]_k = \sum_{i=1}^{\infty} \frac{a_i}{(i \vee k)^2}. \qquad (2)$$

Observe that $M$ is symmetric and Hermitian positive as corresponding to the covariance
matrix of $W_{1/i^2}$, $i = 1, 2, \ldots$, with $W$ standing for the standard Brownian motion. To get
the required trace class property use that $\sum_{i=1}^{\infty} 1/i^2 < \infty$ and apply Theorem 2.12 in Simon
(2005), see also ibidem and Section X.3 in Kato (1976) for the general theory of trace class
operators. In particular, the spectrum $\Sigma(M)$ of $M$ is a countable subset of $\mathbb{R}_+ \cup \{0\}$ with
0 as its only accumulation point and each $\lambda \in \Sigma(M)$, $\lambda \neq 0$, is an eigenvalue of $M$. Zero
belongs to the spectrum as an approximate rather than proper eigenvalue and, moreover,
all eigenvalues of $M$ are simple. Both these facts are easily checked by writing down the
eigenequation $\lambda a_k = [M(a)]_k$, which yields $\lambda (a_{k+1} - a_k) = (1/(k+1)^2 - 1/k^2) \sum_{i=1}^{k} a_i$, $k \ge 1$.
Clearly the solution to this linear difference equation is unique up to a multiplicative
constant for all $\lambda \neq 0$ and identically zero for $\lambda = 0$. We set

$$\mu_\infty := \sum_{\lambda \in \Sigma(M) \setminus \{0\}} \delta_\lambda. \qquad (3)$$

Our first result states that

Theorem 1 Put $m := \lfloor \alpha n \rfloor$. Then, with probability one, the sequence of random measures
$\mu_{n,m}$ converges weakly away from 0 to $\mu_\infty \circ (\alpha)^{-1}$ as $n \to \infty$, where $(\alpha)(x) = \alpha x$ stands
for the operation of multiplication by $\alpha$.
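
Since the spectrum of $M$ is not known explicitly, a numerical look at a finite truncation may help intuition. The Python sketch below assumes the kernel $1/(i \vee k)^2$ of the covariance matrix of $W_{1/i^2}$ and works with the $N \times N$ upper-left corner of $M$; the helper names `apply_M` and `top_eigenpair` and the use of plain power iteration are choices made for this illustration only and do not come from the paper.

```python
import math


def apply_M(a):
    """Apply the N x N truncation of M: [M a]_k = sum_i a_i / max(i, k)^2."""
    n = len(a)
    prefix = 0.0  # running sum of a_i for i <= k
    tail = sum(a[i] / (i + 1) ** 2 for i in range(n))  # sum of a_i / i^2 for i >= k
    out = []
    for k in range(1, n + 1):
        prefix += a[k - 1]
        tail -= a[k - 1] / k ** 2  # now the sum runs over i > k only
        out.append(prefix / k ** 2 + tail)
    return out


def top_eigenpair(n, iterations=200):
    """Largest eigenvalue and unit eigenvector of the truncation, by power iteration."""
    a = [1.0] * n
    for _ in range(iterations):
        b = apply_M(a)
        norm = math.sqrt(sum(x * x for x in b))
        a = [x / norm for x in b]
    b = apply_M(a)
    lam = sum(x * y for x, y in zip(a, b))  # Rayleigh quotient
    return lam, a
```

The computed eigenvector can be checked against the difference equation $\lambda (a_{k+1} - a_k) = (1/(k+1)^2 - 1/k^2) \sum_{i \le k} a_i$, which also holds for the truncated matrix when $k < N$, and the top eigenvalue is bounded above by the trace $\sum_{i \le N} 1/i^2$.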

The problem with this theorem, apart from the fact that we are unable to explicitly
determine $\Sigma(M)$ and thus $\mu_\infty$, is that it is not robust with respect to small modifications of
the dynamics, especially for low vertex ranks, which would have an immediate and
non-negligible effect on the operator $M$ and its spectrum. In particular, the technical issues
discussed in the definition of the (WTA) dynamics above and related to the question of how
to deal with self-transfers (to regard them as charge-leak or charge-freezing events or perhaps
to forbid them altogether) do non-trivially impact the limit behaviour of the spectral measures
$\mu_{n,m}$. This is an undesirable situation in our applications to neural nets in the set-up of
Piekniewski & Schreiber (2008), where the local behaviour of recurrent neural networks is
only approximately driven by the WTA dynamics and it is at the level of the large-scale
global behaviour that we believe this approximation to yield reliable results. On the other
hand, this is also an unavoidable situation in our present setting, because the spectrum of
the spike-flow graphs is strongly affected by its few highest-degree vertices.

To get more universal results we need to change our setting somewhat and to concentrate
on medium degree vertices, cutting off those of highest degree and obtaining theorems
characterising the typical architecture of the spike-flow graph rather than the individual
behaviour of its highest order elite, which is highly sensitive to dynamic details. To this
end, for $\epsilon \in (0, 1)$ consider the $\epsilon$-truncated charge-flow graph where all connections from
and to vertices of rank between 1 and $\epsilon n$ are removed (with the downward flow direction
these are the highest degree vertices). The resulting random connectivity matrix of this
graph is denoted by $A^{n,m;\epsilon}$. We are going to study the spectral measures

n 

i=i 

where $\lambda_1 \ge \lambda_2 \ge \ldots$ are the eigenvalues of $A^{n,m;\epsilon}$, which are clearly real because $A^{n,m;\epsilon}$
is symmetric (note that at least $\lceil \epsilon n \rceil$ among these eigenvalues are 0 due to the above
cut-off). As already signalled above, this construction has a very natural interpretation in
terms of the large scale neural network modeling purposes in Piekniewski & Schreiber (2008)
and Piersa, Piekniewski & Schreiber (2010), where the effective statistical structure of the
charge-flow graph is predominantly studied at the level of moderate and reasonably high
but not highest elite units, which are themselves considered from a somewhat different
angle, see e.g. the discussion on competing basins of attraction of elite nodes in Section
VII.A of Piersa, Piekniewski & Schreiber (2010) for further details.

To proceed, consider the trace class integral operator $K : L_2([1, \infty)) \to L_2([1, \infty))$
given by

$$[K(f)](t) = \int_1^{\infty} \frac{f(s)}{(s \vee t)^2}\, ds. \qquad (4)$$

As in the case of $M$ in (2) above, also here $K$ is Hermitian positive as the covariance operator
of $t \mapsto W_{1/t^2}$, $t \ge 1$, and thus the required trace class property follows by Theorem 2.12
in Simon (2005) because the trace integral $\int_1^{\infty} 1/t^2\, dt$ converges, see also Example X.1.18
in Kato (1976). In particular, the spectrum of $K$ consists of a countable set of isolated
positive eigenvalues accumulating at 0. Zero belongs to the spectrum as an approximate
rather than proper eigenvalue. In contrast to $M$, here we are able to determine
the spectrum of $K$ explicitly.

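Lemma 1 All eigenvalues of $K$ are simple and strictly positive. Moreover, for $\lambda > 0$ we
have

$$\lambda \in \Sigma(K) \iff J_1\left(\frac{2\sqrt{2}}{\sqrt{\lambda}}\right) = 0,$$

where $J_1(\cdot)$ is the Bessel J-function of order 1.

We put

$$\kappa_\infty := \sum_{\lambda \in \Sigma(K) \setminus \{0\}} \delta_\lambda.$$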

Choose a sequence $(\epsilon_n)_{n=1}^{\infty}$, in the sequel often required to satisfy

$$\lim_{n\to\infty} n \epsilon_n = +\infty \ \text{ and there exists } \delta > 0 \text{ such that } \lim_{n\to\infty} n^{1+\delta} \epsilon_n^2 = 0. \qquad (7)$$

Our second main result is

Theorem 2 Put $m := \lfloor \alpha n \rfloor$ and let $\epsilon_n$ be as in (7). Then, with probability one, the
sequence of random measures $\kappa^{\epsilon_n}_{n,m}$ converges weakly away from 0 to $\kappa_\infty \circ (\alpha)^{-1}$ as $n \to \infty$.

The interpretation of the first condition in (7) is rather clear in this context: we want the
cut-off rank $\epsilon_n n$ to move towards $+\infty$ as $n$ does. The second condition in (7) is perhaps less
intuitive and its origin will be explained in the discussion following the proof of Theorem
2.

Upon inspecting its proof, Theorem 2 is easily seen to be insensitive to local dynamic
modifications, such as those discussed following the formulation of Theorem 1, whose
impact is only sensed by eigenvalues in close neighbourhoods of 0. This is important good
news from the viewpoint of our envisioned applications to large scale neural networks.

We conclude this section with one further important remark. It is known, see (9.57) in
Temme (1996), that the $k$-th zero of the Bessel function $J_1$ is asymptotic to $\pi/4 + k\pi$ as $k \to \infty$.
Consequently, by Lemma 1 and Theorem 2, the $k$-th largest atom of the limit of $\kappa^{\epsilon_n}_{n,m}$ is
asymptotically of order $k^{-2}$ for large $k$. This means that the spectral measures $\kappa^{\epsilon_n}_{n,m}$ asymptotically
reproduce the power law with exponent 2 governing the degree distribution of the charge-flow graph, see
Piekniewski & Schreiber (2008). This is rather natural since the large eigenvalues of the
considered adjacency matrix are due to its large degree vertices.
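
The relation between the spectrum of $K$ and the zeros of $J_1$ can be explored numerically. The sketch below assumes that the Bessel condition of Lemma 1 reads $J_1(2\sqrt{2}/\sqrt{\lambda}) = 0$, so that the eigenvalues are $\lambda_k = 8/j_{1,k}^2$ with $j_{1,k}$ the $k$-th positive zero of $J_1$; this reading is consistent with $\mathrm{Tr}\, K = \int_1^\infty t^{-2} dt = 1$ because of the classical Rayleigh sum $\sum_k j_{1,k}^{-2} = 1/8$. The function names are hypothetical, and the power series for $J_1$ is used only to keep the example free of external libraries; it is adequate for the first few zeros but not for large arguments.

```python
import math


def bessel_j1(x, terms=60):
    """Power series J_1(x) = sum_k (-1)^k (x/2)^(2k+1) / (k! (k+1)!)."""
    half = x / 2.0
    term = half  # k = 0 term
    total = term
    for k in range(1, terms):
        term *= -half * half / (k * (k + 1))
        total += term
    return total


def j1_zeros(count):
    """First `count` positive zeros of J_1, bisecting around (k + 1/4) * pi."""
    zeros = []
    for k in range(1, count + 1):
        guess = (k + 0.25) * math.pi  # leading-order asymptotic location
        lo, hi = guess - 0.5, guess + 0.5
        f_lo = bessel_j1(lo)
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            f_mid = bessel_j1(mid)
            if f_lo * f_mid <= 0:
                hi = mid
            else:
                lo, f_lo = mid, f_mid
        zeros.append(0.5 * (lo + hi))
    return zeros


def k_eigenvalues(count):
    # Eigenvalues of K under the assumed reading lambda_k = 8 / j_{1,k}^2.
    return [8.0 / z ** 2 for z in j1_zeros(count)]
```

The values returned by `k_eigenvalues` decrease roughly like $8/(\pi k)^2$, and their partial sums stay below the trace of $K$, which equals 1.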
3 Proofs



3.1 Proof of Theorem 1 

The proof of our Theorem 1 uses the convergence of moments of the spectral measures $\mu_{n,m}$,
which admit a convenient representation as the traces of respective powers of the adjacency
matrix of the considered charge-flow graph. We put

$$M_{k,n} := \int \lambda^k \, d\mu_{n,m}(\lambda). \qquad (8)$$

First we shall show that the desired convergence of moment expectations holds:

Lemma 2 With the notation above we have for $k \ge 1$

$$\lim_{n\to\infty} \mathbb{E} M_{k,n} = \alpha^k \int \lambda^k \mu_\infty(d\lambda).$$

Next, applying appropriate measure concentration techniques, we will use Lemma 2 to
show that

Corollary 1 We have almost surely

$$\lim_{n\to\infty} M_{k,n} = \alpha^k \int \lambda^k \mu_\infty(d\lambda).$$

Finally, applying Corollary 1 we will complete the proof of Theorem 1 by a standard
argument.
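
The moments in (8) can be evaluated without computing any eigenvalues, since $\int \lambda^k d\mu_{n,m}(\lambda) = \mathrm{Tr}([A^{n,m}/n]^k)$. A minimal Python sketch of this identity follows; the helper names `mat_mul` and `spectral_moment` are hypothetical, and the matrix is passed as a plain list of lists.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][l] * Y[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moment(A, k):
    """k-th moment of the measure sum_i delta_{lambda_i / n}, i.e. Tr((A / n)^k)."""
    n = len(A)
    scaled = [[entry / n for entry in row] for row in A]
    power = scaled
    for _ in range(k - 1):
        power = mat_mul(power, scaled)
    return sum(power[i][i] for i in range(n))
```

For the $2 \times 2$ matrix with zero diagonal and off-diagonal entries equal to 2, the rescaled eigenvalues are $\pm 1$, so the odd moments vanish and the even moments equal 2.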

Proof of Lemma 2 To calculate $\mathbb{E} M_{k,n}$ we write first

$$\mathbb{E} M_{k,n} = \mathbb{E}\, \mathrm{Tr}([A^{n,m}]^k) / n^k. \qquad (9)$$

As already indicated in the construction of our standard coupling between the WTA
dynamics for different system sizes, we adopt the convenient convention of numbering from 1
to $m$ the charge units present in the system. Under this convention, whenever a transfer is
made from vertex $i$ to $j$, the number of the unit to be transferred is chosen in some
deterministic way among the numbers ascribed to units stored at $i$, for instance the lowest/highest
or the first/last arrived one. Consequently, recalling the dynamics of the system we get
from (9)

$$\mathbb{E} M_{k,n} = \frac{1}{n^k} \sum_{l_1=1}^{m} \cdots \sum_{l_k=1}^{m} \sum_{U_1=1}^{n} \cdots \sum_{U_k=1}^{n} \mathbb{P}\left( T(U_1, U_2; l_1) \cap T(U_2, U_3; l_2) \cap \ldots \cap T(U_k, U_1; l_k) \right), \qquad (10)$$

where $T(U_i, U_{i+1}; l_i)$ stands for the event that the $l_i$-th charge unit was directly transferred
between vertices $U_i$ and $U_{i+1}$, either from $U_i$ to $U_{i+1}$ or in the opposite direction, in the
course of the system evolution. To proceed, we split the RHS of (10) into a sum of two
terms:

• $S_k$ given as the sum of the RHS terms of (10) for which all $l_i$'s are different,

• $R_k$ given as the sum of the remaining terms in the RHS of (10), that is to say those
where at least two $l_i$'s coincide.

We evaluate $S_k$ first, and then we show that $R_k$ is of a smaller order and thus asymptotically
negligible. Since the sequences of vertices visited by different charge units on their way to
1 are independent, we have

$$S_k = \frac{1}{n^k} \sum_{\substack{l_i \in \{1, \ldots, m\},\ i = 1, \ldots, k \\ l_i \text{ distinct}}} \ \sum_{U_1=1}^{n} \cdots \sum_{U_k=1}^{n} \mathbb{P}(T(U_1, U_2; l_1))\, \mathbb{P}(T(U_2, U_3; l_2)) \cdots \mathbb{P}(T(U_k, U_1; l_k))$$

$$= \frac{m(m-1)\cdots(m-k+1)}{n^k} \sum_{U_1=1}^{n} \cdots \sum_{U_k=1}^{n} \mathbb{P}(T(U_1, U_2; 1))\, \mathbb{P}(T(U_2, U_3; 1)) \cdots \mathbb{P}(T(U_k, U_1; 1)), \qquad (11)$$

with the last equality due to the fact that the evolutions of all charge units coincide in law
as following the same dynamic rules. To evaluate the probability of $T(U_i, U_{i+1}; 1)$ assume
with no loss of generality that $U_{i+1} \le U_i$. Then, since the number of the next vertex to be
visited by a unit charge in the course of its WTA evolution is uniform among the numbers
not exceeding the current vertex number, we have

$$\mathbb{P}(T(U_i, U_{i+1}; 1)) = \frac{1}{U_i}\, \mathbb{P}(T(U_i; 1)), \qquad (12)$$

where $T(U_i; 1)$ is the event that the 1-st charge unit has visited the vertex $U_i$ on its way
towards 1. Now, to find $\mathbb{P}(T(U_i; 1))$ note that, by standard extreme value theory for record
statistics as discussed e.g. in Subsection 4.1 in Resnick (1987), the sequence of different
vertices $V_1 > V_2 > \ldots$ visited by a charge unit coincides in law with the sequence

$$\lceil n \exp(-\eta_1) \rceil, \lceil n \exp(-\eta_2) \rceil, \ldots, \qquad (13)$$

where $\eta_i$ is the $i$-th consecutive point of a homogeneous Poisson point process of intensity
1 on $\mathbb{R}_+$ conditioned on not having more than one point in any of the intervals
$[-\log(U/n), -\log((U-1)/n))$, $U \in \{1, \ldots, n\}$, under the convention that $\log 0 = -\infty$.
Consequently, $\mathbb{P}(T(U_i; 1))$ coincides with the probability that some Poisson point $\eta_j$ falls
into $[-\log(U_i/n), -\log((U_i - 1)/n))$, which is $1 - \exp(-\log(U_i/n) + \log((U_i - 1)/n)) =
1 - \frac{U_i - 1}{U_i} = 1/U_i$. Thus, we conclude from (12) that

$$\mathbb{P}(T(U_i, U_{i+1}; 1)) = \frac{1}{(U_i \vee U_{i+1})^2} \qquad (14)$$

and hence, by (11),

$$S_k = \frac{m(m-1)\cdots(m-k+1)}{n^k} \sum_{U_1=1}^{n} \cdots \sum_{U_k=1}^{n} \prod_{i=1}^{k} \frac{1}{(U_i \vee U_{i+1})^2} \qquad (15)$$

with the convention that $U_{k+1} = U_1$. Further, we want to estimate the contribution brought
by the extra term $R_k$. We claim that



k 



&<a+*<Hff)E-En^r 

f7i=l J7 fc =l i=l 1 l+L 



Ui V C/ m 



+ l/6(m) 



(16) 



Indeed, whenever $l_{i+1} = l_i$, the events $T(U_i, U_{i+1}; l_i)$ and $T(U_{i+1}, U_{i+2}; l_{i+1})$ are no longer
independent and in fact can only co-occur if $U_{i+1}$ lies between $U_i$ and $U_{i+2}$, i.e. $U_i < U_{i+1} <
U_{i+2}$ or $U_i > U_{i+1} > U_{i+2}$, for otherwise one transfer would have two different sources or two
different destinations. Thus, if we proceeded as in our derivation of (11) for $S_k$, we would
lose the factor corresponding to $\mathbb{P}(T(U_{i+1}; l_{i+1}))$ since $T(U_{i+1}; l_{i+1}) = T(U_{i+1}; l_i)$. We
would get 1 instead, but on the other hand we would lose the summation over $l_{i+1}$, which
is now $l_i$. This means losing one of the $k$ prefactors of order $\Theta(m)$ as present in the RHS
of (15) above or, equivalently, keeping summation over a dummy variable $l_{i+1}$ so as not to lose
any prefactors, but with the lost factor $\frac{1}{U_{i+1}}$ replaced by $1/\Theta(m)$ for each instance of $l_{i+1} = l_i$.
This justifies (16) as required. Thus, recalling that $m = \lfloor \alpha n \rfloor$, using (10) and combining
(15) and (16) we obtain

$$\lim_{n\to\infty} \mathbb{E} M_{k,n} = \alpha^k \sum_{U_1=1}^{\infty} \cdots \sum_{U_k=1}^{\infty} \prod_{i=1}^{k} \frac{1}{(U_i \vee U_{i+1})^2} \qquad (17)$$

with the convergence of the RHS series easily verified. Finally, recalling (2), using (17) and
the trace class properties of $M^k$ yields

$$\lim_{n\to\infty} \mathbb{E} M_{k,n} = \alpha^k\, \mathrm{Tr}\, M^k,$$

which completes the proof of Lemma 2 in view of the spectral measure definition (3). □
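
The visiting probability $1/U$ used in the proof can be checked by an exact recursion. The Python sketch below makes a simplifying assumption that should be kept in mind: the unit starts at a uniformly chosen vertex of $\{1, \ldots, n\}$ and every move from a vertex $v$ goes to a uniformly chosen vertex of $\{1, \ldots, v-1\}$, so that self-transfers are ignored altogether. The function name `visit_probabilities` is hypothetical.

```python
from fractions import Fraction


def visit_probabilities(n):
    """Exact P(a unit ever visits vertex U) for U = 1..n.

    Assumption: uniform start on {1..n}; each move from v goes to a uniform
    vertex of {1..v-1} (self-transfers are ignored in this sketch).
    """
    p = [Fraction(0)] * (n + 1)
    inflow = Fraction(0)  # sum over v > U of p[v] / (v - 1)
    for u in range(n, 0, -1):
        p[u] = Fraction(1, n) + inflow  # start at u, or arrive from above
        if u > 1:
            inflow += p[u] / (u - 1)
    return p[1:]
```

Under this assumption the recursion returns exactly $1/U$ for every $U$, in agreement with the Poisson computation in the proof. Other conventions for self-transfers change the values at low ranks, which is in line with the sensitivity discussed after Theorem 1.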
Proof of Corollary 1 We begin by considering a modified version of our basic WTA
dynamics, which is better suited for an application of measure concentration results, whereas
with overwhelming probability its resulting charge-flow graph does coincide with the
original winner-take-all network. The modification is that whenever on its way towards 1 a
charge unit makes more than $n^{1/3}$ jumps, it is forced to make its final jump directly
to 1 rather than further following the usual dynamics. By our Poisson representation
(13) of single charge unit evolution, the number of jumps made on the way to 1 behaves
asymptotically as a Poisson random variable $\mathrm{Po}(\log n)$ with mean $\log n$. Consequently, the
probability that the number of jumps of an individual charge unit exceeds $n^{1/3}$ is at most
of order $\exp(-\Theta(n^{1/3} \log n))$, see e.g. Shorack & Wellner (1986), p. 485. Thus, since the
overall number of charge units is $m = \lfloor \alpha n \rfloor$, the probability that any individual charge
unit makes more than $n^{1/3}$ jumps is still of order $\exp(-\Theta(n^{1/3} \log n))$. Writing $\hat{A}^{n,m}$ for the
adjacency matrix under the modified dynamics we have therefore

$$\mathbb{P}(\hat{A}^{n,m} \neq A^{n,m}) \le \exp(-\Theta(n^{1/3} \log n)). \qquad (18)$$
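
The logarithmic size of the typical number of jumps can be illustrated by a small Monte Carlo experiment. The Python sketch below assumes, as a simplification, that a unit starts at a uniform vertex and that each jump from $v$ goes to a uniform vertex of $\{1, \ldots, v-1\}$ until vertex 1 is reached; under this assumption the expected number of jumps equals $H_n - 1$, where $H_n$ is the $n$-th harmonic number. The names `count_jumps`, `mean_jumps` and `harmonic` are hypothetical.

```python
import random


def count_jumps(n, rng):
    """Number of jumps of one unit before it reaches vertex 1."""
    v = rng.randint(1, n)  # uniform initial vertex
    jumps = 0
    while v > 1:
        v = rng.randint(1, v - 1)  # next vertex is uniform below the current one
        jumps += 1
    return jumps


def mean_jumps(n, samples, seed=0):
    """Empirical mean number of jumps over independent units."""
    rng = random.Random(seed)
    return sum(count_jumps(n, rng) for _ in range(samples)) / samples


def harmonic(n):
    """Harmonic number H_n = 1 + 1/2 + ... + 1/n, which grows like log n."""
    return sum(1.0 / u for u in range(1, n + 1))
```

Comparing `mean_jumps(n, samples)` with `harmonic(n) - 1` for moderate $n$ shows the slow logarithmic growth, which is what makes the cut-off at $n^{1/3}$ jumps harmless for large $n$.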

To complete the proof we shall proceed by induction in $k$. Assume first that $k = 1$ and
note that $\mathrm{Tr}(\hat{A}^{n,m})$ is a 1-Lipschitz function of $\hat{A}^{n,m}$ under the $l_1$-norm on $\mathbb{R}^{n \times n}$. Consider
now the operation of replacing the evolution of a single charge unit under the modified
dynamics by some other evolution with at most $n^{1/3}$ jumps. Let $B$ be the difference matrix
between the new and the original adjacency matrices $\hat{A}^{n,m}$. Clearly, $B$ has at most $4n^{1/3}$
non-zero entries, all of which are ones or minus ones. Thus, such an operation may change
$\mathrm{Tr}(\hat{A}^{n,m})$ by at most $4n^{1/3}$ and, consequently, $\mathrm{Tr}(\hat{A}^{n,m}/n)$ by at most $4n^{-2/3}$. Recalling that
$\hat{A}^{n,m}$ is a function of the evolutions of $m$ individual charge units which are independent, and
using standard measure concentration results for Lipschitz functions of independent entries,
see Corollary 1.17 in Ledoux (2001), we conclude that

$$\mathbb{P}\left( |\mathrm{Tr}(\hat{A}^{n,m}/n) - \mathbb{E}\,\mathrm{Tr}(\hat{A}^{n,m}/n)| > t \right) \le 2 \exp\left( - \frac{t^2}{\Theta(m\, n^{-4/3})} \right) = \exp(-\Theta(t^2 n^{1/3})) \qquad (19)$$

because $m = \lfloor \alpha n \rfloor$. With $t := 1/(\log n)$ relation (19) becomes

$$\mathbb{P}\left( |\mathrm{Tr}(\hat{A}^{n,m}/n) - \mathbb{E}\,\mathrm{Tr}(\hat{A}^{n,m}/n)| > 1/(\log n) \right) \le \exp(-\Theta(n^{1/3} (\log n)^{-2})). \qquad (20)$$

Combining (20) with (18) above now yields

$$\mathbb{P}\left( |\mathrm{Tr}(A^{n,m}/n) - \mathbb{E}\,\mathrm{Tr}(A^{n,m}/n)| > 1/(\log n) \right) \le \exp(-\Theta(n^{1/3} (\log n)^{-2})),$$

whence the assertion of the corollary for $k = 1$ trivially follows by the Borel-Cantelli lemma.
To proceed with our inductive argument, for technical convenience we slightly extend
our assertion for $k \ge 2$ and we show that both

$$\mathbb{P}\left( |\mathrm{Tr}([\hat{A}^{n,m}/n]^k) - \mathbb{E}\,\mathrm{Tr}([\hat{A}^{n,m}/n]^k)| > 1/(\log n) \right) \le \exp(-\Theta(n^{1/3} (\log n)^{-2})) \qquad (21)$$

and

$$\mathbb{P}\left( |\mathrm{Tr}([\mathrm{abs}(\hat{A}^{n,m})/n]^k) - \mathbb{E}\,\mathrm{Tr}([\mathrm{abs}(\hat{A}^{n,m})/n]^k)| > 1/(\log n) \right) \le \exp(-\Theta(n^{1/3} (\log n)^{-2})) \qquad (22)$$

hold for all $k \ge 2$, with the absolute value matrix $\mathrm{abs}(\hat{A}^{n,m})$ understood here in the usual
spectral sense (the same eigenvectors, eigenvalues replaced by absolute values). Assuming
that (21) and (22) have already been established for $k - 1$ (unless $k = 2$ where we only
assume (21) to hold) we define an auxiliary modified trace functional $\widetilde{\mathrm{Tr}}_k(\cdot)$, $k \ge 2$, by
putting for an $n \times n$ matrix $A$

1. If $k = 2$ and

$$\mathrm{Tr}(A) \le 2\alpha^{k-1} \int \lambda^{k-1} \mu_\infty(d\lambda) \qquad (23)$$

then $\widetilde{\mathrm{Tr}}_k(A) := \mathrm{Tr}(A^k)$,

2. If $k \ge 3$ and $k$ is odd and

$$\mathrm{Tr}(A^{k-1}) \le 2\alpha^{k-1} \int \lambda^{k-1} \mu_\infty(d\lambda) \qquad (24)$$

then $\widetilde{\mathrm{Tr}}_k(A) := \mathrm{Tr}(A^k)$,

3. If $k \ge 3$ and $k$ is even and

$$\mathrm{Tr}(\mathrm{abs}(A)^{k-1}) \le 2\alpha^{k-1} \int \left[ \lambda^{k-2} + \lambda^{k} \right] \mu_\infty(d\lambda) \qquad (25)$$

then $\widetilde{\mathrm{Tr}}_k(A) := \mathrm{Tr}(A^k)$,

4. Otherwise, define $\widetilde{\mathrm{Tr}}_k(A) := \mathrm{Tr}(\bar{A}^k)$ where $\bar{A}$ is the metric projection of $A$ onto the
set $\mathcal{A} = \mathcal{A}[k, n, \alpha, \mu_\infty]$ given as

(a) the set of $n \times n$ matrices satisfying (23) if $k = 2$ (as in case 1.)

(b) the set of $n \times n$ matrices satisfying (24) if $k \ge 3$ and $k$ is odd (as in case 2.)

(c) the set of $n \times n$ matrices satisfying (25) if $k \ge 3$ and $k$ is even (as in case 3.)

Note that by Klein's lemma, see e.g. Lemma 6.4 in Guionnet (2009), the set $\mathcal{A}$
is convex and closed and the matrix $\bar{A}$ is simply the matrix in $\mathcal{A}$ minimising the
Euclidean distance to $A$ in $\mathbb{R}^{n \times n}$.

This somewhat technical definition has a very simple interpretation: the modified trace
functional $\widetilde{\mathrm{Tr}}_k(\cdot)$ coincides with the usual trace of $A^k$ provided that the corresponding trace
of $\mathrm{abs}(A)^{k-1}$ is not too large; otherwise the modified trace is defined as the trace of $\bar{A}^k$,
where $\bar{A}$ is a version of the matrix $A$ projected onto an appropriate convex set $\mathcal{A}$ so that
the trace of its $(k-1)$-th power does not exceed the corresponding controllable threshold
given by the RHS of (23), (24) and (25) respectively. We denote this threshold by $\tau[k, \alpha, \mu_\infty]$
for reference below. The extra auxiliary relation (22) is needed to ensure the convexity of
$\mathcal{A}$ for $k$ even.

The further argument is quite standard now: the above modified trace functional
coincides with the original one with overwhelming probability and at the same time it is well
behaved as admitting well controllable oscillations and thus is suitable for the usual measure
concentration techniques. Indeed, using (18) and applying Lemma 2 combined with the
observation that $|\lambda|^{k-1} \le \lambda^{k-2} + \lambda^{k}$ for $k \ge 2$ even, we conclude from (21) and (22) for
$k - 1$ that

$$\mathbb{P}\left( \mathrm{Tr}([\hat{A}^{n,m}/n]^k) \neq \widetilde{\mathrm{Tr}}_k(\hat{A}^{n,m}/n) \right) \le \exp(-\Theta(n^{1/3} (\log n)^{-2})). \qquad (26)$$

Recalling now that the derivative of A i— > Tr(A k ) in the direction of a matrix B is given 
by kTr(A k ~ 1 B) (see e.g. Lemma 6.1 in Guionnet (2009)), taking A := A n ' m and letting B 
be the evolution replacement difference matrix as above, we conclude by convexity of the 
projection set A^ and upon recalling that B has at most An 1 ^ non-zero entries, all plus 
or minus ones, that 

\fi k {[A n ' m + B\/n) - fi k ([A n ' m /n])\ < kn~Hn l ' Z T[k,a, /ij = G(n~ 2/3 ). 

Thus, using again that $\hat{A}^{n,m}$ is a function of the evolutions of $m$ individual charge units, which are independent, and applying one more time Corollary 1.17 in Ledoux (2001), we obtain

$$\mathbb{P}\big(|\widehat{\mathrm{Tr}}_k(\hat{A}^{n,m}/n) - \mathbb{E}\,\widehat{\mathrm{Tr}}_k(\hat{A}^{n,m}/n)| > 1/(\log n)\big) \le \exp\big(-\Theta(n^{1/3}(\log n)^{-2})\big) \qquad (27)$$
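To see where the exponent $n^{1/3}(\log n)^{-2}$ comes from, the following back-of-the-envelope computation may help. It assumes that the concentration inequality is applied in its bounded-differences form $\mathbb{P}(|F - \mathbb{E}F| > r) \le 2\exp(-r^2/(2\sum_i c_i^2))$, where $c_i$ bounds the change of $F$ when the $i$-th independent coordinate is replaced. The symbols $F$, $r$, $c_i$ and the constant $C$ are introduced here for illustration only.

```latex
% Sketch: m = \lfloor \alpha n \rfloor independent charge units; replacing one unit's
% evolution changes the functional by at most c_i \le C n^{-2/3} (see the preceding display).
\begin{align*}
\sum_{i=1}^{m} c_i^2 &\le \alpha n \cdot C^2 n^{-4/3} = \alpha C^2\, n^{-1/3}, \\
r = \frac{1}{\log n}
&\;\Longrightarrow\;
\frac{r^2}{2\sum_{i} c_i^2} \ge \frac{n^{1/3}}{2 \alpha C^2 (\log n)^{2}}, \\
\mathbb{P}\big(|F - \mathbb{E}F| > r\big)
&\le 2\exp\!\Big(-\frac{n^{1/3}}{2 \alpha C^2 (\log n)^{2}}\Big).
\end{align*}
% The prefactor 2 can be absorbed into the constant in the exponent for large n.
```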

in full analogy to (20). When combined with (26) this yields the required relation (21) for $k$. The second inductive relation (22) follows in full analogy by using the fact that the derivative of $A \mapsto \mathrm{Tr}(\mathrm{abs}(A)^k)$ in the direction of a matrix $B$ is $k\,\mathrm{Tr}(\mathrm{abs}(A)^{k-1}B)$ for $k \ge 2$, see again e.g. Lemma 6.1 in Guionnet (2009). This completes the inductive argument and shows that both (21) and (22) hold for all $k \ge 2$.

Finally, putting (21) together with (18) we come to

$$\mathbb{P}\big(|\mathrm{Tr}([A^{n,m}/n]^k) - \mathbb{E}\,\mathrm{Tr}([A^{n,m}/n]^k)| > 1/(\log n)\big) \le \exp\big(-\Theta(n^{1/3}(\log n)^{-2})\big) \qquad (28)$$

for all $k \ge 1$, which completes the proof of Corollary 1 by a straightforward application of the Borel-Cantelli lemma. □

Completing the proof of Theorem 1. Having established Corollary 1, we readily complete the proof of Theorem 1 using that the trace class operator $M$ has in particular a finite spectral radius and resorting to the standard method of moments and the classical Carleman criterion, see e.g. Shohat & Tamarkin (1943), p. 19, applied to the measures $\mu'_{n,m}(d\lambda) := \lambda\,\mu_{n,m}(d\lambda)$, whose sequence of moments coincides with that of $\mu_{n,m}$ shifted by one. This way we conclude that a.s. $\mu'_{n,m}$ converges weakly to $\mu'_\infty$ with $\mu'_\infty(d\lambda) = \lambda\,\mu_\infty(d\lambda)$, whence the desired a.s. weak convergence of $\mu_{n,m}$ to $\mu_\infty$ away from zero follows. □

Remarks. An intuitive explanation of Theorem 1 can be provided by noting that, in view of (14), we have

and the fluctuations of $A^{n,m}_{ij}$ are easily controllable as coming from independent evolutions of $m$ charge units. Consequently, $A^{n,m}/n$ a.s. converges to $\alpha M$ in many reasonably strong senses provided by operator theory, and thus $\mu_\infty \circ (\alpha)^{-1}$ is a natural candidate for the limit of the spectral measures $\mu_{n,m}$. This could be a starting point for an alternative proof of Theorem 1, but presumably a much more complicated one than ours, as it would require the use of measure concentration tools in the Banach space of linear operators endowed with the trace class norm, and then quite involved and technical additional considerations relating the convergence of operators to spectral measure convergence. In this context, we strongly prefer the method of moments, as it lets us avoid unnecessary technicalities.

3.2 Proof of Theorem 2

As in the proof of Theorem 1, here too we use the convergence of moments. With $m = \lfloor \alpha n \rfloor$ we put

(29)
We shall first establish the following convergence of expectations.

Lemma 3. With $\varepsilon_n$ such that $\lim_{n\to\infty} \varepsilon_n n = +\infty$ and $\lim_{n\to\infty} \varepsilon_n = 0$ we have for $k \ge 1$

$$\lim_{n\to\infty} \mathbb{E} M_k^{\varepsilon_n,n} = \alpha^k \int \lambda^k\, \kappa_\infty(d\lambda).$$

In analogy to the corresponding step in the proof of Theorem 1, here too the convergence of expectations will be strengthened to a.s. convergence using measure concentration.

Corollary 2. Assume that the sequence $\varepsilon_n$ satisfies (7). Then we have almost surely

$$\lim_{n\to\infty} M_k^{\varepsilon_n,n} = \alpha^k \int \lambda^k\, \kappa_\infty(d\lambda).$$

Note that, unlike in Lemma 3, in Corollary 2 we do require the full strength of (7). This corollary will lead us to the desired assertion of Theorem 2 by a standard argument.


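Proof of Lemma 3. To calculate $\mathbb{E} M_k^{\varepsilon_n,n}$ write

$$\mathbb{E} M_k^{\varepsilon_n,n} = \varepsilon_n^k\, \mathbb{E}\,\mathrm{Tr}([A^{n,m;\varepsilon_n}]^k).$$

In full analogy with the corresponding argument leading to (15) and (16) in the proof of Theorem 1 above, we obtain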

uvx E ••• E tl wvu- i)2 < em m< 

U! = le n n] U k = \e n n\ i=l V % l+LJ 
n k 



(30) 



U\ = \e n n\ U k = \e n n \ i 

Consequently, as $n \to \infty$, we have in view of (30)



Ui V u l+l 



+ l/6(m) 



('-(^^ e ... e nfe 



Substituting $\bar{u}_i := u_i/(\varepsilon_n n)$, recognising appropriate integral sums in the RHS and recalling that $m = \lfloor \alpha n \rfloor$, we therefore get by our assumptions on $\varepsilon_n$

$$\lim_{n\to\infty} \mathbb{E} M_k^{\varepsilon_n,n} = \alpha^k \int_1^\infty \cdots \int_1^\infty \prod_{i=1}^{k} \frac{1}{(u_i \vee u_{i+1})^2}\, du_1 \ldots du_k.$$

Recalling the definition (5) of $K$ and the trace class properties of $K^k$ this yields

$$\lim_{n\to\infty} \mathbb{E} M_k^{\varepsilon_n,n} = \alpha^k\, \mathrm{Tr}\, K^k.$$

This completes the proof of Lemma 3 in view of the spectral measure definition (6). □
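The passage from sums to integrals can be made explicit by tracking the powers of $\varepsilon_n n$. The sketch below rests on several assumptions not confirmed by this section: that the kernel in (5) is $K(s,t) = (s \vee t)^{-2}$ on $[1,\infty)$, as the integrand suggests; that the summation indices range over $\lceil \varepsilon_n n \rceil, \dots, n$; that $u_{k+1} := u_1$; and that the summand has the leading-order form $m/(u_i \vee u_{i+1})^2$. The precise two-sided bounds are those of (30).

```latex
% Rescaling v_i = u_i / (\varepsilon_n n): every factor and every sum contributes a power of \varepsilon_n n.
\begin{align*}
\frac{1}{(u_i \vee u_{i+1})^2} &= \frac{1}{(\varepsilon_n n)^2}\,\frac{1}{(v_i \vee v_{i+1})^2},
\qquad
\sum_{u_i} g\Big(\frac{u_i}{\varepsilon_n n}\Big) \approx \varepsilon_n n \int_1^{1/\varepsilon_n} g(v)\,dv, \\
\varepsilon_n^k \sum_{u_1,\dots,u_k} \prod_{i=1}^{k} \frac{m}{(u_i \vee u_{i+1})^2}
&\approx \varepsilon_n^k \Big(\frac{m}{(\varepsilon_n n)^2}\Big)^{k} (\varepsilon_n n)^{k}
\int \prod_{i=1}^{k} \frac{dv_i}{(v_i \vee v_{i+1})^2}
= \Big(\frac{m}{n}\Big)^{k} \int \prod_{i=1}^{k} \frac{dv_i}{(v_i \vee v_{i+1})^2}, \\
\operatorname{Tr} K^k &= \int_1^\infty \!\cdots\! \int_1^\infty \prod_{i=1}^{k} K(v_i, v_{i+1})\, dv_1 \cdots dv_k,
\qquad
\operatorname{Tr} K = \int_1^\infty \frac{dv}{v^2} = 1 .
\end{align*}
% Since m/n \to \alpha and 1/\varepsilon_n \to \infty, the right-hand side tends to \alpha^k \operatorname{Tr} K^k.
```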






Proof of Corollary 2. Our argument here goes very much along the same lines as the proof of Corollary 1. Note first that Lemma 3 is applicable under the assumptions of Corollary 2 because (7) does in particular imply the conditions on $\varepsilon_n$ imposed in the statement of the lemma. Again, we consider a modified version of the WTA dynamics, the modification being that whenever on its way towards 1 a charge unit makes more than $n^{\delta/3}$ jumps, it is forced to make its final jump directly to 1 rather than further following the usual dynamics. Recall that $\delta$ is determined by (7) as assumed in the statement of the corollary. Writing again $\hat{A}^{n,m}$ for the adjacency matrix under the modified dynamics we have in full analogy with (18)

$$\mathbb{P}(\hat{A}^{n,m} \neq A^{n,m}) \le \exp\big(-\Theta(n^{\delta/3}\log n)\big). \qquad (31)$$

In analogy to the proof of Corollary 1, here too we consider the operation of replacing the evolution of a single charge unit under the modified dynamics by some other evolution with at most $n^{\delta/3}$ jumps. Denoting by $B$ the difference matrix between the new and the original adjacency matrices $\hat{A}^{n,m}$ we see that $B$ has at most $4n^{\delta/3}$ non-zero entries, all of which are ones or minus ones. This observation puts us again in a position to apply measure concentration results for Lipschitz functionals with respect to product measures, following nearly verbatim the respective lines of the inductive argument for Corollary 1. Note that in our present set-up the modified trace functional $\widehat{\mathrm{Tr}}_k$ involves projections onto the convex set $\mathcal{A}_{\kappa_\infty}$ defined in full analogy to the corresponding $\mathcal{A}_{\mu_\infty}$. Moreover, $\hat{A}^{n,m}/n$ in the proof of Corollary 1 is replaced by $\varepsilon_n \hat{A}^{n,m}$ here due to the different scaling. This way, in analogy to (21), we conclude that

$$\mathbb{P}\big(|\mathrm{Tr}([\varepsilon_n \hat{A}^{n,m}]^k) - \mathbb{E}\,\mathrm{Tr}([\varepsilon_n \hat{A}^{n,m}]^k)| > 1/(\log n)\big) \le \exp\big(-\Theta(n^{\delta/3}(\log n)^{-2})\big) \qquad (32)$$

for all $k \ge 1$. Using (31) we get

$$\mathbb{P}\big(|\mathrm{Tr}([\varepsilon_n A^{n,m}]^k) - \mathbb{E}\,\mathrm{Tr}([\varepsilon_n A^{n,m}]^k)| > 1/(\log n)\big) \le \exp\big(-\Theta(n^{\delta/3}(\log n)^{-2})\big)$$

in analogy to (28), whence the assertion of Corollary 2 follows by the Borel-Cantelli lemma. □
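The final Borel-Cantelli step can be spelled out as follows. The events $E_n$ and the constant $c > 0$ are notation introduced here for illustration only, with $c$ standing for the implicit constant in the $\Theta(\cdot)$ bound.

```latex
% E_n: the deviation event controlled by the last probability bound.
\begin{align*}
E_n &:= \Big\{ \big|\mathrm{Tr}([\varepsilon_n A^{n,m}]^k) - \mathbb{E}\,\mathrm{Tr}([\varepsilon_n A^{n,m}]^k)\big| > \frac{1}{\log n} \Big\},
\qquad
\mathbb{P}(E_n) \le \exp\!\big(-c\, n^{\delta/3} (\log n)^{-2}\big), \\
c\, n^{\delta/3}(\log n)^{-2} &\ge 2\log n \ \text{ for all large } n
\;\Longrightarrow\; \mathbb{P}(E_n) \le n^{-2}
\;\Longrightarrow\; \sum_{n} \mathbb{P}(E_n) < \infty, \\
&\Longrightarrow\; \mathbb{P}(E_n \text{ infinitely often}) = 0 .
\end{align*}
% Hence, almost surely, the deviation is at most 1/\log n for all large n and so tends to 0;
% together with Lemma 3 this gives the a.s. limit \alpha^k \int \lambda^k \kappa_\infty(d\lambda).
```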

Completing the proof of Theorem 2. Since the trace class operator $K$ has in particular a finite spectral radius, the desired assertion of Theorem 2 now follows readily in view of Corollary 2 by the standard method of moments and Carleman's criterion, see e.g. p. 19 in Shohat & Tamarkin (1943), used in analogy to the corresponding proof-completing paragraph for Theorem 1. □
Justification of condition (7). We note at this point that, intuitively speaking, the independent contributions to the random matrix $A^{n,m}$ brought by each of the $m$ evolving charge units should bring respective variance contributions to the trace $\mathrm{Tr}([\varepsilon_n A^{n,m}]^k)$ of the order $\log n$ per unit ($\log n$ being the order of the number of unit charge jumps before leaking out from the system), which sums up to order $\Theta(m \log n\, \varepsilon_n^2) = \Theta(n \log n\, \varepsilon_n^2)$ upon taking all units into account. Therefore it is natural to require that $n\varepsilon_n^2$ converges to $0$ faster than $1/(\log n)$, which is roughly the content of the second condition in (7), for otherwise we should not hope for a deterministic limit of $\mathrm{Tr}([\varepsilon_n A^{n,m}]^k)$ and thus of $\mu_{n,m;\varepsilon_n}$ as $n \to \infty$. This informal observation should be regarded as a justification for (7) rather than as a mathematical statement, though.

3.3 Proof of Lemma 1 

Assume that $\phi \in L^2([1,\infty))$ is a non-zero eigenfunction of the operator $K$ corresponding to some eigenvalue $\lambda \ge 0$. The corresponding eigenequation reads

$$\lambda\phi(t) = \frac{1}{t^2}\int_1^t \phi(s)\,ds + \int_t^\infty \frac{\phi(s)}{s^2}\,ds. \qquad (33)$$

Since the RHS is an application of the integral operator with a well-behaved kernel, both sides are readily seen to be differentiable, and the differentiation yields

$$\lambda\phi'(t) = \frac{1}{t^2}\phi(t) - \frac{1}{t^2}\phi(t) - \frac{2}{t^3}\int_1^t \phi(s)\,ds = -\frac{2}{t^3}\int_1^t \phi(s)\,ds.$$

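Putting

$$\Psi(t) := \int_1^t \phi(s)\,ds$$

we get the differential equation

$$\lambda\Psi''(t) = -\frac{2}{t^3}\Psi(t) \qquad (34)$$

with the initial condition

$$\Psi(1) = 0. \qquad (35)$$

The solution to this equation is $\Psi \equiv 0$ for $\lambda = 0$, which shows that $0$ is not an eigenvalue, and, for $\lambda \neq 0$,

$$\Psi(t) = \sqrt{t}\left(C_1 J_1\!\left(\sqrt{\frac{8}{\lambda t}}\right) + C_2 Y_1\!\left(\sqrt{\frac{8}{\lambda t}}\right)\right), \qquad (36)$$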
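where $J_1$ and $Y_1$ are, respectively, the Bessel J- and Y-functions of order 1 (Bessel functions of the first and second kind respectively) and $C_1, C_2$ are general constants. Differentiating for $\lambda \neq 0$ we come to

$$\phi(t) = \frac{C_1}{\sqrt{t}}\left[J_1\!\left(\sqrt{\frac{8}{\lambda t}}\right) - \sqrt{\frac{2}{\lambda t}}\,J_0\!\left(\sqrt{\frac{8}{\lambda t}}\right)\right] + \frac{C_2}{\sqrt{t}}\left[Y_1\!\left(\sqrt{\frac{8}{\lambda t}}\right) - \sqrt{\frac{2}{\lambda t}}\,Y_0\!\left(\sqrt{\frac{8}{\lambda t}}\right)\right]. \qquad (37)$$

Recall now that, in small $h > 0$ asymptotics, $J_1(h) \sim h/2$, $J_0(h) \sim 1$, $Y_1(h) \sim -\frac{2}{\pi h}$ and $Y_0(h) \sim \frac{2}{\pi}\log h$, see e.g. Section 9.4 in Temme (1996). By (37), for large $t > 0$ this readily yields $\phi(t) = C_1 O(1/t) + C_2 \Theta(1)$. Likewise, by (36), $\Psi(t) = C_1 O(1) + C_2 \Theta(t)$. Consequently, since $\phi \in L^2([1,\infty))$, we must have $C_2 = 0$. In view of (36) and (35) this is only possible when

$$J_1\!\left(\sqrt{\frac{8}{\lambda}}\right) = 0. \qquad (38)$$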
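That a function of the form $\sqrt{t}\,\mathcal{C}_1(\sqrt{8/(\lambda t)})$ solves (34), read as $\lambda\Psi'' = -2\Psi/t^3$, can be checked against a standard transformation of Bessel's equation. The sketch below uses the classical fact that $w(z) = z^{p}\,\mathcal{C}_\nu(c z^{q})$, with $\mathcal{C}_\nu$ any Bessel function of order $\nu$, solves the first displayed equation. The parameters $p$, $q$, $c$ are notation introduced here, and $\lambda > 0$ is assumed.

```latex
\begin{align*}
& z^{2} w'' + (1 - 2p)\, z\, w' + \big(c^{2} q^{2} z^{2q} + p^{2} - \nu^{2} q^{2}\big) w = 0,
\qquad w(z) = z^{p}\, \mathcal{C}_\nu(c z^{q}), \\
& \text{(34) multiplied by } t^{2}/\lambda: \quad t^{2} \Psi'' + \frac{2}{\lambda}\, t^{-1}\, \Psi = 0, \\
& 1 - 2p = 0,\quad 2q = -1,\quad c^{2} q^{2} = \frac{2}{\lambda},\quad p^{2} - \nu^{2} q^{2} = 0
\;\Longrightarrow\;
p = \tfrac12,\; q = -\tfrac12,\; c = \sqrt{8/\lambda},\; \nu = 1, \\
& \Psi(t) = \sqrt{t}\; \mathcal{C}_1\!\big(\sqrt{8/(\lambda t)}\big),
\qquad \Psi(1) = 0 \text{ with } C_2 = 0,\ C_1 \neq 0 \;\Longrightarrow\; J_1\!\big(\sqrt{8/\lambda}\big) = 0 .
\end{align*}
```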

Thus, all eigenvalues of $K$ are positive real numbers satisfying (38). Moreover, they are all simple since, under (38) and with $C_2 = 0$, the solution of (34) and (35) is unique up to a multiplicative constant. It remains to check that each $\lambda > 0$ satisfying (38) is an eigenvalue of $K$. To this end it is enough to recall the eigenequation (33) and observe that its LHS is $\lambda\phi(t) = o(1/t)$ and converges to $0$ in large $t$ asymptotics, and so does the RHS, which is asymptotic to $\Psi(t)/t^2 = O(1/t^2)$ as $t \to \infty$. This completes the proof of Lemma 1. □
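Reading (38) as $J_1(\sqrt{8/\lambda}) = 0$, the eigenvalues can be written explicitly in terms of the positive zeros $j_{1,1} < j_{1,2} < \dots$ of $J_1$. The following is a sketch with numerical values rounded to four decimals. The notation $\lambda_i$, $j_{1,i}$ is introduced here for illustration, and the trace check assumes the kernel $K(s,t) = (s \vee t)^{-2}$ on $[1,\infty)$.

```latex
\begin{align*}
\sqrt{8/\lambda_i} = j_{1,i}
&\;\Longleftrightarrow\;
\lambda_i = \frac{8}{j_{1,i}^{2}}, \qquad i = 1, 2, \dots, \\
j_{1,1} \approx 3.8317 &\;\Longrightarrow\; \lambda_1 \approx 0.5449, \qquad
j_{1,2} \approx 7.0156 \;\Longrightarrow\; \lambda_2 \approx 0.1625 .
\end{align*}
% Consistency check via the Rayleigh sum \sum_i j_{\nu,i}^{-2} = 1/(4(\nu+1)) with \nu = 1:
\[
\sum_{i \ge 1} \lambda_i = 8 \sum_{i \ge 1} \frac{1}{j_{1,i}^{2}} = 8 \cdot \frac{1}{8} = 1
= \int_1^\infty \frac{dt}{t^{2}} = \operatorname{Tr} K .
\]
```

The agreement of the eigenvalue sum with the trace is consistent with the eigenvalues being simple and exhausted by the solutions of (38).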